Bayesian constrained frequency warping HMMS for speaker normalisation

نویسندگان

  • Ching-Hsiang Ho
  • Saeed Vaseghi
  • Aimin Chen
چکیده

This paper presents a Bayesian constrained frequency warping technique. The Bayesian approach provides for inclusion of the prior information of the frequency warping parameter and for adjusting the search range in order to obtain the best warping factor dependent on HMMs. We introduce novel frequency warping (FWP) HMMs which are different warped versions of HMMs. Instead of frequency warping of the input speech we warp the spectrum of the HMMs. This is equivalent to HMMs which have both time and frequency warping capabilities. Experimentally FWP HMMs outperform the conventional constrained frequency warping approach. Furthermore, the best warping factor is estimated in two stages, a coarse stage followed by a fine stage. This method efficiently gauges the optimal warping factor and normalises the FWP HMMs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Frequency Warping for Speaker Adaptation in HMM-based Speech Synthesis

Speaker adaptation in speech synthesis transforms a source utterance to a target utterance that differs from the source in terms of voice characteristics. In this paper, we employ vocal tract length normalization, which is generally used in speech recognition to remove individual speaker characteristics, to speaker adaptation in speech synthesis. We propose a frequency warping approach based on...

متن کامل

Wavelet-Based Mel-Frequency Cepstral Coefficients for Speaker Identification using Hidden Markov Models

To improve the performance of speaker identification systems, an effective and robust method is proposed to extract speech features, capable of operating in noisy environment. Based on the time-frequency multi-resolution property of wavelet transform, the input speech signal is decomposed into various frequency channels. For capturing the characteristic of the signal, the Mel-Frequency Cepstral...

متن کامل

Applying vocal tract length normalization to meeting recordings

Vocal Tract Length Normalisation (VTLN) is a commonly used technique to normalise for inter-speaker variability. It is based on the speaker-specifi c warping of the frequency axis, parameterised by a scalar warp factor. This factor is typically estimated using maximum likelihood. We discuss how VTLN may be applied to multiparty conversations, reporting a substantial decrease in word error rate ...

متن کامل

On combining vocal tract length normalisation and speaker adaptation for noise robust speech recognition

This paper investigates the combination of vocal tract length normalisation and speaker adaptation in connected digit recognition. In particular, we focus on performing this task under a continuously varying car noise environment. Continuous supervised speaker and environment adaptation is carried out on the test data according to the Bayesian framework. The paper also evaluates various approac...

متن کامل

On Combining Vocal Tract Length Normalisation and Speaker Adapation for Noise Robust Speech Recognition

This paper investigates the combination of vocal tract length normalisation and speaker adaptation in connected digit recognition. In particular, we focus on performing this task under a continuously varying car noise environment. Continuous supervised speaker and environment adaptation is carried out on the test data according to the Bayesian framework. The paper also evaluates various approac...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998